Method, apparatus and program storage device for scheduling the performance of maintenance tasks to maintain a system environment

ABSTRACT

A method, apparatus and program storage device for scheduling the performance of maintenance tasks to maintain a system environment is disclosed. A parameter for a computer system is monitored to detect a need to perform at least one maintenance task. At least one maintenance task is performed when the monitoring detects the need to perform at least one maintenance task or at least once within a predetermined period.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates in general to maintaining a computer system, and more particularly to a method, apparatus and program storage device for scheduling the performance of maintenance tasks to maintain a system environment.

2. Description of Related Art

The defining environment for application development is changing. E-business applications are being used to leverage the Internet as a platform for building and integrating applications over a network environment. Further, application servers are moving from a processor-based operating system to an Internet-based operating system or network computing. The term network computing may be somewhat ambiguous. Nevertheless, it is increasingly being used to refer to virtual applications that are assembled from many different components that run on many machines across a network as if they were a single system. Components can range from entire centralized applications to single modules of a larger distributed application. Most important, it is becoming clear that only virtual applications can deliver the flexibility to meet many of the leading-edge needs of today's corporate information systems. Accordingly, these application servers are being used to integrate applications, business processes, and data. To leverage investments, businesses are turning to open standard based infrastructures to allow solutions to be crafted based on the best products for an application, process, etc., rather than being locked into a single-vendor solution.

However, such an open standard integrated solution throughout the enterprise requires tools and systems to accomplish the integration, connectivity, modeling, monitoring and management functions. People, processes, information, and systems must be integrated throughout the enterprise. Connectivity refers to connecting applications and systems across a company and to partners and customers. Modeling includes the ability to model and simulate business processes to graphically represent the flow of work across people and application systems. Monitoring is provided by tracking business processes as they execute in applications and systems across the enterprise. Management of the enterprise demands visualization of immediate operational results of business processes. This critical knowledge enables review of business and system processes over a period of time so that bottlenecks and problem areas can be identified and corrected.

In order to connect processes and to transform data, a diverse collection of functions must be provided. This diverse collection of functions is referred to as middleware. The term middleware is an inclusive term that encompasses many disparate functions that do not easily fit within other architectural components. Thus, middleware may be considered an aggregation of distinct subcomponents. Middleware provides application services that were once written into applications. However, these services today are provided in an independent infrastructure layer. Middleware enhances application integration by providing uniform mechanisms to bridge old and new technologies, or by enabling dissimilar elements to work together.

One type of middleware is message-oriented middleware. Message-oriented middleware allows application programs that may be distributed across similar or dissimilar platforms and/or network protocols to exchange data with each other using messages and queues. In order to maintain a clean and efficient environment, certain maintenance procedures must be run regularly. Such maintenance tasks, however, are not easily performed by built-in utilities. In fact, some important tasks cannot be performed at all by built-in utilities. Thus, the system administrator is left with the task of determining how to best carry out these tasks. Still, not all administrators are skilled in programming such tasks. Moreover, administrators may not know the optimum time to run maintenance tasks. For example, maintenance tasks may be scheduled to run too often. Because such maintenance tasks may be resource intensive, the performance of the server may suffer. Alternatively, maintenance tasks may not be scheduled often enough and therefore the system may not be operating efficiently.

It can be seen then that there is a need for a method, apparatus and program storage device for scheduling the performance of maintenance tasks to maintain a system environment.

SUMMARY OF THE INVENTION

To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus and program storage device for scheduling the performance of maintenance tasks to maintain a system environment.

The present invention solves the above-described problems by providing a way for administrators to skillfully maintain servers. The present invention determines when to run maintenance operations based on predetermined system criteria.

A method in accordance with an embodiment of the present invention includes monitoring a parameter for a computer system to detect a need to perform at least one maintenance task and performing at least one maintenance task when the monitoring detects the need to perform at least one maintenance task or at least once within a predetermined period.

In another embodiment of the present invention, a system for scheduling the performance of maintenance tasks to maintain a system environment is provided. The system includes a maintenance tool for providing resources for performing at least one maintenance task and a maintenance scheduling device for monitoring a parameter for a computer system to detect a need to perform at least one maintenance task and causing the maintenance tool to perform at least one maintenance task when the maintenance scheduling device detects the need to perform at least one maintenance task or at least once within a predetermined period.

In another embodiment of the present invention, another system for scheduling the performance of maintenance tasks to maintain a system environment is provided. This system includes means for providing resources for performing at least one maintenance task and means for monitoring a parameter for a computer system to detect a need to perform at least one maintenance task and causing the means for providing resources to perform at least one maintenance task when the means for monitoring detects the need to perform at least one maintenance task or at least once within a predetermined period.

In another embodiment of the present invention, a program storage medium tangibly embodying one or more programs of instructions executable by a computer to perform a method for scheduling the performance of maintenance tasks to maintain a system environment is provided. The method includes monitoring a parameter for a computer system to detect a need to perform at least one maintenance task and performing at least one maintenance task when the monitoring detects the need to perform at least one maintenance task or at least once within a predetermined period.

These and various other advantages and features of novelty which characterize the invention are pointed out with particularity in the claims annexed hereto and form a part hereof. However, for a better understanding of the invention, its advantages, and the objects obtained by its use, reference should be made to the drawings which form a further part hereof, and to accompanying descriptive matter, in which there are illustrated and described specific examples of an apparatus in accordance with the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Referring now to the drawings in which like reference numbers represent corresponding parts throughout:

FIG. 1 illustrates a distributed computing environment according to an embodiment of the present invention;

FIG. 2 illustrates a middleware layer infrastructure according to an embodiment of the present invention;

FIG. 3 illustrates a system for scheduling the performance of maintenance tasks to maintain a system environment according to an embodiment of the present invention;

FIG. 4 illustrates examples of management tools according to an embodiment of the present invention;

FIG. 5 illustrates a flow chart of the maintenance program execution according to an embodiment of the present invention;

FIG. 6 illustrates a detailed flow chart for the maintenance and recovery task operations according to an embodiment of the present invention; and

FIG. 7 illustrates the dynamic execution of the maintenance utility according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

In the following description of the embodiments, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration the specific embodiments in which the invention may be practiced. It is to be understood that other embodiments may be utilized because structural changes may be made without departing from the scope of the present invention.

The present invention provides a method, apparatus and program storage device for scheduling the performance of maintenance tasks to maintain a system environment. Administrators use the present invention to skillfully maintain servers. When to run maintenance operations is determined based on predetermined system criteria.

FIG. 1 illustrates a distributed computing environment 100 according to an embodiment of the present invention. FIG. 1 illustrates three computing domains 110-114. The three computing domains 110-114 include at least one server 120 and at least one client 122 networked to the server 120. The domains 110-114 are coupled through a network 130. Applications running in each of the domains 110-114 are usually built using centralized approaches, wherein the business rules and data that comprise the application reside on a single mainframe or network server, and lack ways to participate as components on the network. To prevent isolation between the domains 110-114, middleware services 140 are used to provide uniform mechanisms to bridge the different technologies and to enable dissimilar elements to work together. Middleware 140 comprises computer software that runs on a computer in at least one of the domains 110-114.

FIG. 2 illustrates a middleware layer infrastructure 200 according to an embodiment of the present invention. In FIG. 2 the middleware layer 210 provides an independent infrastructure layer that sits between applications 212, networks 214, databases 216, and distributed services communications mechanisms 218. The middleware layer 210 enhances application integration by providing uniform mechanisms to bridge between the applications 212, networks 214, databases 216, and distributed services communications mechanisms 218. Management of the infrastructure is facilitated by the middleware layer 210 by allowing for the centralized management of application services that are shared by many applications. The middleware layer 210 also reduces the costs and complexity of implementing change in infrastructure services by enabling changes to be made across many applications, rather than to each application that uses a service.

One type of middleware service that is used to ensure that the applications and other functions are running smoothly is a maintenance utility. For example, in a distributed computing environment, certain maintenance procedures must be run regularly to ensure the system is operating properly. For example, servers in a computing environment must be monitored to prevent crashes and/or to facilitate peak performance. Such maintenance tasks, however, are not easily performed by built-in utilities. In fact, some important tasks cannot be performed at all by built-in utilities. Thus, the system administrator is left with the task of determining how to best carry out these tasks. Still, not all administrators are skilled in programming such tasks. Moreover, administrators may not know the optimum time to run maintenance tasks. Thus, the present invention provides a maintenance tool to provide an administrator the ability to determine when settings changed as well as the option of restoring old settings through saved files.

FIG. 3 illustrates a system 300 for scheduling the performance of maintenance tasks to maintain a system environment according to an embodiment of the present invention. In FIG. 3, a central processing unit 310 may access memory 312 and run an operating system 314. The operating system 314 may access application 320. The application 320 includes instances 330-334. Master configuration file 344 identifies each of the application instances configured on the system, as well as default settings for each instance. Each instance 330-334 may include an application instance-specific configuration file 340 and application instance transaction logs 342.

The present invention includes a maintenance scheduling tool 350 to schedule maintenance tasks. Maintenance tasks are provided via the maintenance tools 352. The maintenance tools 352 maintain a maintenance log file 354. The maintenance tools 352 are part of a larger suite of tools created to provide daily maintenance routines, as well as routines that copy settings to an archive location for later point-in-time recovery, if necessary. A script is provided via the maintenance scheduling tool 350 to determine how often the maintenance tools 352 are to be run. For example, administration setup maintenance may need to be performed nightly on all servers. However, some servers that are heavily used may need maintenance run more often than just once a day. Statically scheduled run times, however, may provide too little or too much maintenance.

In one embodiment of the present invention, the script provided in the maintenance scheduling tool 350 to control the maintenance tools 352 is a Korn shell script. A Korn shell script is a program that reads textual commands from the user or from a file, converts them into operating system commands, and executes them. A wrapper 356 sits on top of the script in the maintenance scheduling tool 350 to dynamically determine whether the maintenance tools 352 should be run. For example, the wrapper 356 may check at regular intervals such as every 30 minutes. The wrapper 356 will cause the maintenance tools to be run at least every 24 hours or other preset period, but also self-regulates in case log filesystems are in danger of filling up. Thus, the wrapper 356 in the maintenance scheduling tool 350 monitors server conditions and runs the maintenance tools 352 whenever prudent. However, the present invention is not meant to be limited to a particular period or pattern. In this manner, the maintenance tools 352 function as a “maintenance on demand” package of tools. If the maintenance tools 352 are run too often, thereby indicating some sort of problem on the server, an alert is generated and the administrator is paged, for example, via email 360.
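By way of illustration only, the decision made by the wrapper 356 may be sketched as follows. The sketch is written in Python rather than as a Korn shell script, and the names WrapperPolicy, maintenance_due and runs_too_often, the 10% free-space threshold, and the limit of four runs per day are all hypothetical values chosen for the example; the specification does not fix them.

```python
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class WrapperPolicy:
    # All thresholds below are illustrative assumptions, not values from the text.
    max_interval_s: float = 24 * 3600   # run at least once per this period
    min_free_fraction: float = 0.10     # below this, the filesystem is "in danger"


def free_fraction(path):
    """Fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def maintenance_due(now, last_run, free_fraction, policy=WrapperPolicy()):
    """True when the preset period has elapsed or free space is too low."""
    overdue = last_run is None or (now - last_run) >= policy.max_interval_s
    disk_low = free_fraction < policy.min_free_fraction
    return overdue or disk_low


def runs_too_often(run_times, now, window_s=24 * 3600, max_runs=4):
    """True when more than `max_runs` runs fall inside the trailing window,
    which may indicate a server problem worth alerting on."""
    recent = [t for t in run_times if now - t <= window_s]
    return len(recent) > max_runs
```

In such a sketch, a caller would evaluate maintenance_due at each check interval (for example every 30 minutes) and page the administrator when runs_too_often returns True.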

Some commands may be executed directly within the interpreter itself, e.g., setting variables or control constructs, while others may cause it to load and execute other files. Unix's command interpreters are known as shells. The maintenance tools 352 employ several other external programs to complete their operation.

The cron scheduling utility 370 is an operating system scheduler that waits for a set time and then executes a predetermined command. For example, the maintenance tools 352 may be scheduled to complete once each night and then log all operations to an output log. The maintenance tools 352 then may email 360 the administrator whenever they do not complete successfully. This provides administrators with the ability to manage problems before they grow and take down the server. For example, the maintenance tools may be used to clear unneeded log space.

Log files 358, such as the channel exit log file, are pruned to keep them from growing too large. Log files 358 may have a preset size, and once full, existing files are copied to a backup file. Any “error” output may be emailed to the administrator for review. Also, all program activity may be logged to a file 358 for review and to track whether maintenance was started manually or via a cron scheduling utility.
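The copy-to-backup behavior described above may be sketched, under stated assumptions, as follows. The function name rotate_if_full and the ".bak" suffix are hypothetical, and the sketch assumes that the active log is emptied after it has been copied, which the text implies but does not state.

```python
import os
import shutil


def rotate_if_full(log_path, max_bytes, backup_suffix=".bak"):
    """Copy a full log to a backup file and start a fresh, empty log.

    Returns True when a rotation took place, False otherwise.
    """
    if not os.path.exists(log_path) or os.path.getsize(log_path) < max_bytes:
        return False
    # Preserve the existing contents (and timestamps) in the backup file.
    shutil.copy2(log_path, log_path + backup_suffix)
    # Truncate the active log so that it can grow again from zero.
    with open(log_path, "w"):
        pass
    return True
```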

A management console 380 provides an administrator access to and control of the maintenance tools 352 and scheduler 350. The management console 380 may include a display 382 for displaying status or other data provided by the maintenance tools 352.

FIG. 4 illustrates examples of management tools 400 according to an embodiment of the present invention. FIG. 4 illustrates the save command 410, which is used to dump all configuration information to a text file for reuse 412. The last two iterations may be kept actively, with older versions retained as archives. Another tool is the built-in checkpoint command 420 for all running queue managers that use linear logging. (A queue manager is an application instance for asynchronous messaging applications such as IBM WebSphere MQ.) The built-in checkpoint command 420 causes all objects to be written to log for immediate recovery if corrupted 422.

A clean log command 430 may be provided for all running queue managers using linear logging. The clean log script 430 looks in the error logs for the latest transaction logs that are needed for full recovery 432. The clean log script 430 will then take older logs and zip them up to conserve space. A prune command 440 may be provided to prune old compressed logs from clean logs that are no longer needed. The prune command 440 keeps linear logging queue managers from taking over the filesystem and filling it up 442. If linear logging queue managers are provided and do not run this, the filesystem will eventually fill up and take the queue manager down, causing an outage. The prune utility 440 provides administrators with the ability to roll back to an earlier point in time, but also keeps file systems free of obsolete files.
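As a non-authoritative sketch of the clean log and prune operations, the following assumes that transaction log file names sort in chronological order and that the name of the oldest log still needed for recovery has already been determined (the text obtains it from the error logs; that parsing is not shown). The file names used in the usage note, the gzip format, and the function names are assumptions made for the example.

```python
import gzip
import os
import shutil


def compress_older_logs(log_dir, oldest_needed):
    """Gzip every log whose name sorts before `oldest_needed`.

    Returns the names of the logs that were compressed and removed.
    """
    compressed = []
    for name in sorted(os.listdir(log_dir)):
        if name.endswith(".gz") or name >= oldest_needed:
            continue  # already compressed, or still needed for recovery
        src = os.path.join(log_dir, name)
        with open(src, "rb") as fin, gzip.open(src + ".gz", "wb") as fout:
            shutil.copyfileobj(fin, fout)
        os.remove(src)
        compressed.append(name)
    return compressed


def prune_compressed(log_dir, keep):
    """Delete all but the `keep` newest compressed logs; return those deleted."""
    archives = sorted(n for n in os.listdir(log_dir) if n.endswith(".gz"))
    doomed = archives[:-keep] if keep > 0 else archives
    for name in doomed:
        os.remove(os.path.join(log_dir, name))
    return doomed
```

For instance, with hypothetical logs named S0000001.LOG through S0000005.LOG and S0000004.LOG as the oldest log still needed, the first function compresses logs 1 through 3, and prune_compressed(log_dir, 2) then deletes the oldest compressed log. Retaining some compressed logs is what preserves the ability to roll back to an earlier point in time.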

The save authorizations script 450 is a command that should be run periodically for all running queue managers. The save authorizations script 450 provides recovery of application-based authorizations in case a queue manager needs to be rebuilt, or if moved to another box 452. Such scripts may be kept in a specified log file. For example, ten of the authorization scripts may be kept to provide 2.5 months' ability to recover should problems be encountered. An archive-old-configuration-files script 460 provides logs beyond the standard two configuration files for each queue manager. The archive-old-configuration-files command 460 takes old configuration files periodically, and copies them to an archive directory 462. For example, if taken twice weekly, 50 of these archive files may be kept to provide six full months of recovery ability. The archive-old-configuration-files feature is especially helpful for queue managers whose definitions (configuration) change regularly, or where non-standard administrators may have access to alter objects, and a record of it needs to be kept.
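The retention figures above follow from simple arithmetic, which may be sketched as below. The function names are hypothetical, and the weekly cadence used for the authorization scripts in the usage note is an assumption, since the text does not state how often they are saved.

```python
import math


def coverage_weeks(files_kept, saves_per_week):
    """Number of weeks of history covered by `files_kept` archive files."""
    if saves_per_week <= 0:
        raise ValueError("saves_per_week must be positive")
    return files_kept / saves_per_week


def files_needed(target_weeks, saves_per_week):
    """Smallest number of archive files that covers `target_weeks` of history."""
    return math.ceil(target_weeks * saves_per_week)
```

Under these assumptions, coverage_weeks(50, 2) gives 25 weeks, which is approximately the six months mentioned above, and coverage_weeks(10, 1) gives ten weeks for ten authorization scripts saved weekly. Covering a full 26 weeks at two saves per week would take files_needed(26, 2), that is, 52 files.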

Those skilled in the art will recognize that the present invention is not meant to be limited to the maintenance tasks illustrated above with reference to FIG. 4. Rather, embodiments in accordance with the present invention may run any type of maintenance tools based on system criteria. Further, by combining maintenance tasks, a maintenance scheduling tool according to embodiments of the present invention may provide a comprehensive maintenance routine. For example, the combination of the save command 410 and the save authorizations script 450 allows a complete snapshot of activity that can later be used to fix problems or restore configurations.

FIG. 5 illustrates a flow chart 500 of the maintenance program execution according to an embodiment of the present invention. In FIG. 5, the tool determines the operating-system type and tailors execution to the specified environment 510. Application instances are then retrieved 512. For each instance, maintenance and recovery-preparation tasks are performed 514. Errors during execution of the maintenance and recovery-preparation tasks are monitored 516. A determination is made whether errors occurred 520. If errors occurred 532, the administrator is alerted of the error condition 540. If not 534, the maintenance program terminates 550. All execution activities are logged for later review 542.
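One possible reading of the flow of FIG. 5 is sketched below. The function run_maintenance and its callable parameters (maintain, alert, log) are hypothetical stand-ins for the per-instance tasks, the administrator alert, and the activity log; the comments map each step to the block numbers given above.

```python
import platform


def run_maintenance(instances, maintain, alert, log):
    """Run per-instance maintenance, collect errors, and alert if any occurred."""
    os_type = platform.system()               # block 510: determine OS type
    log(f"maintenance started on {os_type}")  # block 542: log all activity
    errors = []
    for inst in instances:                    # block 512: instances retrieved by caller
        try:
            maintain(inst, os_type)           # block 514: maintenance and recovery prep
            log(f"{inst}: ok")
        except Exception as exc:              # block 516: monitor for errors
            errors.append((inst, str(exc)))
            log(f"{inst}: FAILED: {exc}")
    if errors:                                # blocks 520/532: errors occurred?
        alert(errors)                         # block 540: alert the administrator
    log("maintenance finished")               # block 550: terminate
    return errors
```

Passing the tasks in as callables is a design choice made for the sketch so that the control flow can be exercised without a real queue manager.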

FIG. 6 illustrates a detailed flow chart 600 for the maintenance and recovery task operations (e.g., block 514 of FIG. 5) according to an embodiment of the present invention. In FIG. 6, the maintenance and recovery task is initiated 610. A transaction log type is obtained 612. The application instance version and status are also obtained 614. A determination is made whether the instance is running 620 and whether the log type is linear or circular 630, 640. If the instance is not running 622 and the system is circular 632, the archived configuration and authorization files are cleaned up 680 and the system returns to check for execution errors 690.

If the instance is running 624 and the system is circular 642, the current configuration is exported to an archive 650. The current authorizations are also exported to an archive 660. Then, the archived configuration and authorization files are cleaned up 680 and the system returns to check for execution errors 690.

If the instance is running 624 and the system is linear 644, the transaction checkpoint is recorded 670, e.g., the transaction log files are updated for pruning. The old transaction logs are zipped and removed 672. Because the instance is running, the current configuration is exported to an archive 650. The current authorizations are also exported to an archive 660. Then, the archived configuration and authorization files are cleaned up 680 and the system returns to check for execution errors 690.

If the instance is not running 622 and the system is linear 634, the old transaction logs are zipped and removed 674. Because the instance is not running 622, the archived configuration and authorization files are merely cleaned up 680 and the system returns to check for execution errors 690. Those skilled in the art will recognize that the blocks deciding whether the log type is linear or circular 630, 640 may be the same process, and in a like manner so may the blocks where old transaction logs are zipped and removed 672, 674.
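The four cases of FIG. 6 described above reduce to two independent conditions, which the following sketch makes explicit. The function plan_tasks and the step names it returns are hypothetical labels for the numbered blocks, not identifiers from the specification.

```python
def plan_tasks(running, log_type):
    """Return the ordered maintenance steps for one application instance."""
    if log_type not in ("linear", "circular"):
        raise ValueError(f"unknown log type: {log_type!r}")
    steps = []
    if log_type == "linear":
        if running:
            steps.append("record_checkpoint")       # block 670
        steps.append("zip_and_remove_old_logs")     # blocks 672 / 674
    if running:
        steps.append("export_configuration")        # block 650
        steps.append("export_authorizations")       # block 660
    steps.append("clean_up_archives")               # block 680 (always performed)
    return steps
```

Written this way, the log type governs only the transaction log handling and the running state governs only the checkpoint and the exports, which is consistent with the observation that blocks 630 and 640, and blocks 672 and 674, may be the same process.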

Furthermore, those skilled in the art will recognize that the present invention is not meant to be limited to the maintenance tasks illustrated above with reference to FIG. 6. Rather, embodiments in accordance with the present invention may run any type of maintenance tools based on system criteria. Examples of additional maintenance tasks that may be performed according to the maintenance scheduling tool of the present invention include, but are not limited to, providing database consistency checks, ensuring database compaction, performing full-text index generation, performing view recalculations, etc.

FIG. 7 illustrates the dynamic execution of the maintenance utility 700 according to an embodiment of the present invention. In FIG. 7, the cron scheduling utility checks the wrapper utility according to a schedule 710, e.g., every 30 minutes, once daily, at set times, etc., to determine if the wrapper that manages execution of maintenance tools is running. (The cron utility only makes sure the wrapper is running; the wrapper utility then runs continuously and periodically checks whether maintenance should be executed.) If it is not found to be running, the cron scheduling utility starts the wrapper, which then monitors the need for maintenance in a dynamic, continuous fashion. The wrapper then determines whether the maintenance tools are already running 712. If the maintenance tools are actively executing currently, the wrapper will not start parallel maintenance tasks. Next, the need for maintenance is ascertained through checking current conditions such as free disk space 714. A determination is made whether the maintenance utility should be performed now 720 (for example, if maintenance tasks have not yet been performed today, if free disk space is getting too low, etc.). If not 724, the wrapper utility sleeps again for a predetermined amount of time 740, after which the process is repeated. If the maintenance utility should be run now 722, the maintenance utility is performed 730.
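A single pass of the wrapper loop of FIG. 7 may be sketched as follows. The function wrapper_cycle and its callable parameters are hypothetical, and the sketch assumes that the wrapper sleeps after every pass, including one in which maintenance was performed; the text describes the sleep only for the case in which maintenance is not yet needed.

```python
def wrapper_cycle(is_running, is_due, run, sleep, interval_s=1800):
    """Perform one pass of the wrapper loop and report what happened."""
    if is_running():              # block 712: never start parallel maintenance
        outcome = "already-running"
    elif is_due():                # blocks 714/720: check conditions, decide
        run()                     # block 730: perform the maintenance utility
        outcome = "ran"
    else:
        outcome = "skipped"       # branch 724: nothing to do yet
    sleep(interval_s)             # block 740: sleep before the next pass
    return outcome
```

A long-running wrapper would call this function in an endless loop, for example with time.sleep as the sleep argument, while the cron scheduling utility merely restarts the wrapper if it is found not to be running.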

The process illustrated with reference to FIGS. 1-7 may be tangibly embodied in a computer-readable medium or carrier, e.g., one or more of the fixed and/or removable data storage devices 388 illustrated in FIG. 3, or other data storage or data communications devices. The computer program 390 may be loaded into memory 312 to configure the system 300 for execution of the computer program 390. The computer program 390 includes instructions which, when read and executed by a processor, such as central processing unit 310 of FIG. 3, cause the devices to perform the steps necessary to execute the steps or elements of an embodiment of the present invention.

Accordingly, embodiments of the present invention ensure that the servers are operating according to specifications. Referring again to FIG. 3, the maintenance tool 352 periodically performs maintenance tasks via control of the maintenance scheduling device 350, and if any of the tasks cannot be accomplished, or if an error is detected, the maintenance tool alerts the system administrator. The frequency with which the maintenance scheduling device 350 causes the maintenance tool 352 to run is determined according to predetermined criteria, e.g., timing, system conditions such as disk space availability, server usage, etc.

The foregoing description of the exemplary embodiment of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not with this detailed description, but rather by the claims appended hereto.

1. A method for scheduling the performance of maintenance tasks to maintain a system environment, comprising: monitoring a parameter for a computer system to detect a need to perform at least one maintenance task; and performing at least one maintenance task when the monitoring detects the need to perform at least one maintenance task or at least once within a predetermined period.
2. The method of claim 1, wherein the monitoring a computer system parameter further comprises monitoring conditions on a server.
3. The method of claim 2, wherein the performing at least one maintenance task when the monitoring detects the need to perform at least one maintenance task further comprises performing at least one maintenance task when the server meets a predetermined criteria.
4. The method of claim 3, wherein the predetermined criteria comprises disk space for the server becoming too low.
5. The method of claim 1, wherein the performing at least one maintenance task further comprises running a maintenance routine for improving computer system operation.
6. The method of claim 1, wherein the performing at least one maintenance task further comprises backing up settings to an archive.
7. The method of claim 6, wherein the backing up settings to an archive further comprises saving configurations and authorizations for applications running on a server.
8. The method of claim 1, wherein the performing at least one maintenance task further comprises writing log files.
9. The method of claim 1, wherein the performing at least one maintenance task further comprises reducing a size for files stored on the computer system.
10. The method of claim 1 further comprising alerting an administrator upon an occurrence of an event.
11. The method of claim 10, wherein the alerting an administrator upon an occurrence of an event further comprises alerting an administrator when a maintenance task fails.
12. The method of claim 10, wherein the alerting an administrator upon an occurrence of an event further comprises alerting an administrator when a maintenance task is run too often.
13. The method of claim 1, wherein the performing at least one maintenance task at least once within a predetermined period further comprises performing at least one maintenance task once a day.
14. A system for scheduling the performance of maintenance tasks to maintain a system environment, comprising: a maintenance tool for providing resources for performing at least one maintenance task; and a maintenance scheduling device for monitoring a parameter for a computer system to detect a need to perform at least one maintenance task and causing the maintenance tool to perform at least one maintenance task when the maintenance scheduling device detects the need to perform at least one maintenance task or at least once within a predetermined period.
15. The system of claim 14, wherein the computer system further comprises a server.
16. The system of claim 15, wherein the maintenance scheduling device causes the maintenance tool to perform at least one maintenance task when the server meets a predetermined criteria.
17. The system of claim 16, wherein the predetermined criteria comprises disk space for the server becoming too low.
18. The system of claim 14, wherein the at least one maintenance task further comprises a maintenance routine for improving computer system operation.
19. The system of claim 14, wherein the at least one maintenance task further comprises a backup of system settings to an archive.
20. The system of claim 19, wherein the system settings further comprise configurations and authorizations for applications running on a server.
21. The system of claim 14, wherein the at least one maintenance task further comprises writing log files.
22. The system of claim 14, wherein the at least one maintenance task further comprises reducing a size for files stored on the computer system.
23. The system of claim 14, wherein the maintenance tool alerts an administrator upon an occurrence of an event.
24. The system of claim 23, wherein the event further comprises a failure of a maintenance task.
25. The system of claim 23, wherein the event further comprises a maintenance task running too often.
26. The system of claim 14, wherein the predetermined period further comprises once a day.
27. A system for scheduling the performance of maintenance tasks to maintain a system environment, comprising: means for providing resources for performing at least one maintenance task; and means for monitoring a parameter for a computer system to detect a need to perform at least one maintenance task and causing the means for providing resources to perform at least one maintenance task when the means for monitoring detects the need to perform at least one maintenance task or at least once within a predetermined period.
28. A program storage medium readable by a computer, the medium tangibly embodying one or more programs of instructions executable by the computer to perform a method for scheduling the performance of maintenance tasks to maintain a system environment, the method comprising: monitoring a parameter for a computer system to detect a need to perform at least one maintenance task; and performing at least one maintenance task when the monitoring detects the need to perform at least one maintenance task or at least once within a predetermined period.